AMENDMENTS TO THE CLAIMS

This listing of claims will replace all prior listings and versions of claims in the 
application: 

Listing of Claims: 

1. (Currently Amended) A method for performing a supervised learning process in
an artificial intelligence environment within a computer system, the method including optimizing 
a database of sample records for the training and testing of a prediction algorithm for predicting 
the presence or absence of a specified medical condition in a patient, the method comprising 
the steps of: 

defining a set of one or more distributions of the database records onto respective 
training and testing subsets; 

using the defined set of distributions to train and test a first generation set of one or more 
prediction algorithms and assigning a fitness score to each, each of said prediction algorithms 
being associated with a certain distribution of said database records; 

feeding the set of prediction algorithms to an evolutionary algorithm which generates a 
set of one or more second generation prediction algorithms and assigns a fitness score to each; 
and 

continuing to feed each generational set of prediction algorithms to the evolutionary 
algorithm until a termination event occurs, [[where]] wherein said termination event is at least one
of: 

a prediction algorithm [[is]] generated with a fitness score equal to or exceeding a 
defined minimum value, 

the maximum fitness score of successive generational sets of prediction 
algorithms converging to a given value, [[and]] or

a certain number of generations having been generated; 

selecting a prediction algorithm having a best fitness score; and

using the distribution of database records associated with said selected prediction 
algorithm in performing supervised learning, said supervised learning including training and 
testing of prediction algorithms to obtain a trained prediction algorithm, wherein 

said method is performed using a computer and computer software forming an 
intelligent system, and 

the trained prediction algorithm is effective to predict output variables for data 
relating to said condition, thereby predicting diagnosis of said condition, 

[[and]] the method further comprising the steps of:

generating a population of prediction algorithms, [[where]] wherein each one of said
prediction algorithms is trained and tested according to a different distribution of the records of 
the data set in the complete database onto a training data set and a testing data set, 

each different distribution being created as one or more of a random distribution
[[and a distribution formed by a deterministic mathematical process characterized as]] or a
pseudorandom distribution,

each prediction algorithm of [[the]] said population being trained according to its 
own distribution of records of the training set and [[is]] being validated in a blind way
according its own distribution on the testing set, and 

a score reached by each prediction algorithm being calculated in the testing 
phase representing its fitness; 

providing an evolutionary algorithm which combines the different models of distribution 
of the records of the complete data set in a training and in a testing set, which sets are
represented each one by a corresponding prediction algorithm trained and tested on the basis 
of [[the]] said training and testing data set according to the fitness score calculated in the 
previous step for the corresponding prediction algorithm, 

the fitness score of each prediction algorithm corresponding to one of the 
different distributions of the complete data set on the training and the testing data sets 
being the probability of evolution of each prediction algorithm or of each said distribution 
of the complete data set on the training and testing data sets; 

repeating the evolution of the prediction algorithm generation for a finite number of 
generations or till the output of the genetic algorithm converges to a best solution and/or till the
fitness value of at least some prediction algorithm related to an associated data records 
distribution has reached a desired value; and 

setting the data records distribution for the best solution as the optimized training and 
testing subsets for training and testing prediction algorithm. 
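
The steps of claim 1 can be illustrated with a minimal sketch. It is not the applicant's implementation. It assumes records with a single numeric input and a binary output, a hypothetical nearest-centroid classifier standing in for the prediction algorithm, test accuracy as the fitness score, and a plain one-point crossover with a single mutation as the evolutionary algorithm. The names `nearest_centroid_accuracy` and `evolve_split` are illustrative only. Each individual is a binary mask assigning every record to the training subset (1) or the testing subset (0), and the fitness score is used as the selection weight, mirroring "the probability of evolution".

```python
import random

def nearest_centroid_accuracy(records, mask):
    """Fitness: train on records with mask 1, test blindly on records with mask 0."""
    train = [r for r, m in zip(records, mask) if m == 1]
    test = [r for r, m in zip(records, mask) if m == 0]
    labels = {y for _, y in train}
    if not test or len(labels) < 2:
        return 0.0  # degenerate split: nothing to test, or a class is missing
    centroids = {}
    for label in labels:
        xs = [x for x, y in train if y == label]
        centroids[label] = sum(xs) / len(xs)
    hits = sum(1 for x, y in test
               if min(centroids, key=lambda c: abs(x - centroids[c])) == y)
    return hits / len(test)

def evolve_split(records, pop_size=8, max_generations=20, target=1.0, seed=0):
    """Evolve train/test masks; stop on target fitness or generation limit."""
    rng = random.Random(seed)
    n = len(records)
    # First generation: random distributions of records onto the two subsets.
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best_mask, best_fit = None, -1.0
    for _ in range(max_generations):
        scored = [(nearest_centroid_accuracy(records, m), m) for m in population]
        scored.sort(key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0]
        if best_fit >= target:
            break  # termination event: minimum fitness reached
        # Fitness acts as the probability of being chosen as a parent.
        weights = [f + 1e-6 for f, _ in scored]
        children = []
        while len(children) < pop_size:
            (_, a), (_, b) = rng.choices(scored, weights=weights, k=2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]      # one-point crossover
            i = rng.randrange(n)
            child[i] = 1 - child[i]        # single-gene mutation
            children.append(child)
        population = children
    return best_mask, best_fit
```

The mask returned by `evolve_split` plays the role of the optimized distribution that the last step of the claim sets as the training and testing subsets. A convergence test on the best fitness of successive generations, the third termination event in the claim, is omitted here for brevity.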

2. (Cancelled) 

3. (Currently Amended) A method according to claim 1, [[characterised in that]]
wherein to each record of the data set a distribution variable is associated which is binary
and has at least two [[status]] statuses, one of this two [[status]] statuses being associated with the
inclusion of the record in the training set and the other in the testing set.

4. (Currently Amended) A method according to claim 1, [[characterised that]] wherein
the prediction algorithm is an artificial neural network. 

5. (Currently Amended) A method according to claim 1, [[characterised in that]]
wherein the prediction algorithm is a classification algorithm.

6. (Currently Amended) A method according to claim [[characterised in that]]
wherein once an optimum distribution has been computed, the [[optimised]] optimized training
data subset is made equal to a complete data set being the individuals included in the training 
subset distributed onto a new training set and onto a new testing set each one having about the 
half of the records of the original optimized training set, while the originally optimized testing set 
is used as a third data subset for validation purposes. 
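
The three-way split described in claim 6 can be sketched as follows. This is an illustrative reading, not the applicant's code: the function name `resplit_training_set` is hypothetical, and the new split is drawn by a simple random shuffle, whereas claim 7 contemplates optimizing that split with the method itself.

```python
import random

def resplit_training_set(records, mask, seed=0):
    """Split the optimized training subset (mask 1) into two halves and
    keep the optimized testing subset (mask 0) as a validation subset."""
    rng = random.Random(seed)
    old_train = [r for r, m in zip(records, mask) if m == 1]
    validation = [r for r, m in zip(records, mask) if m == 0]
    shuffled = old_train[:]          # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2        # "about the half" of the records each
    new_train, new_test = shuffled[:half], shuffled[half:]
    return new_train, new_test, validation
```

With ten records and six of them in the optimized training subset, the call yields three new training records, three new testing records and the four original testing records as the validation subset.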

7. (Currently Amended) A method according to claim 6, [[characterised in that]]
wherein the distribution of the data of the originally optimized training set onto the new
training and new testing set is optimized [[by means of]] through a pre-processing phase including
the steps of said method for optimizing a database of sample records, said records being 
records in the originally optimized training set.

8. (Currently Amended) A method according to claim 1, [[in which]] wherein different
choices of the structure of the training subset and the structure of the testing subset consist in
different selections of the number of input variables of the data records of the database, which
selections consist in leaving out at least one[[, preferably two or more variables]] variable from the
entire input variable set forming each record, the records of the database comprising a certain 
number of known input variables and a certain number of known output variables. 

9. (Currently Amended) A method according to claim 8, [[characterised by]] further
comprising the following steps: 

defining a distribution of data from the complete data set onto a training data set and 
onto a testing data set; 

generating a population of different prediction algorithms each one having a training 
and/or testing data set in which only some variables have been considered among all the 
original variables provided in the data sets, each one of the prediction algorithms being 
generated [[by means of]] through a different selection of variables;

carrying out learning and testing of each prediction algorithm of the population and 
evaluating the fitness score of each prediction algorithm; 

applying an evolutionary algorithm to the population of prediction algorithms for 
achieving new generations of prediction algorithm; 

for each generation of new prediction algorithms representing each one a new different 
selection of input variable, testing or validating the best prediction algorithm according to the 
best hypothesis of input variables selection [[is tested or validated]]; and

evaluating a fitness score [[is evaluated]] and promoting the prediction algorithms,
representing the selections of input variables which have the best testing performances and the
minimum input variables, [[are promoted]] for the processing of the new generations.
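
The input-variable selection of claims 8 and 9 can be pictured with a small sketch. It assumes, purely for illustration, that a selection is a binary mask over the input variables and that promotion ranks selections by testing accuracy first and by the number of retained variables second. The claim does not fix how the two criteria are weighed, and the function names are hypothetical.

```python
def select_variables(record, mask):
    """Keep only the input variables whose selection flag is 1."""
    return [value for value, keep in zip(record, mask) if keep]

def rank_selections(scored):
    """Order (mask, accuracy) pairs: best accuracy first, and among
    equally accurate selections the one using the fewest variables."""
    return sorted(scored, key=lambda item: (-item[1], sum(item[0])))

# Three candidate selections over three input variables.
candidates = [([1, 1, 1], 0.9), ([1, 0, 1], 0.9), ([0, 0, 1], 0.7)]
promoted = rank_selections(candidates)[0]
```

Here `promoted` is the selection that leaves out the second variable: it matches the accuracy of the full set while using fewer inputs.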

10. (Currently Amended) A method according to claim 8, further comprising a pre- 
processing phase, including the steps of said method for optimizing a database of sample 
records, for selecting the most predictive input variables.

11. (Currently Amended) A method according to claim 1,

[[in which]] wherein different choices of the structure of the training subset and the structure
of the testing subset [[consist in]] comprise different selections of the number of input variables of
the data records of the database, which selections [[consist in]] comprise leaving out at least one[[,
preferably two or more variables]] variable from the entire input variable set forming each record,
the records of the database comprising a certain number of known input variables and a certain 
number of known output variables, 

and further comprising a pre-processing phase, including the steps of said method for 
optimizing a database of sample records, for selecting the most predictive input variables, 

[[characterised in that]] wherein the database subjected to [[the]] a pre-processing phase of
input variable selection is a training subset and a testing subset processed with said method. 

12. (Currently amended) A method according to claim '\ L [[characterised in that]]
wherein the complete database, the distribution of the records of which has to be optimized,
has data records having a selected number of input variables, the selection being carried out
with said method, and in which different choices of the structure of the training subset and the
structure of the testing subset [[consist in]] comprise different selections of the number of input
variables of the data records of the database, which selections [[consist in]] comprise leaving out
at least one[[, preferably two or more variables]] variable from the entire input variable set forming
each record, the records of the database comprising a certain number of known input variables 
and a certain number of known output variables. 

13. (Currently Amended) A method according to claim [[characterised in that]]
wherein a preprocessing phase for optimizing the distribution of the records on a training
subset and a testing subset and for selecting the most predictive input variables[[,]] is carried 
out alternatively one to the other several times. 

14. (Currently Amended) A method according to claim \^ [[characterised in that]]
wherein the evolutionary algorithm is a genetic algorithm with the following evolutionary
rules: 

an average health value of the population is computed as a function of the fitness values 
of each single individual in the population; 

coupling, recombination of genes and mutation of genes are carried out in a 
differentiated manner depending on a comparison between the fitness of each individual of the 
couple and the average health value of the entire population to which the individuals belong; 

individuals having a fitness value lower or equal to the average health of the entire 
population are not excluded from the creation of new generations but are marked out and 
entered in a vulnerability list; and 

the number of subjects entered in the vulnerability list defining the number of possible 
marriages. 
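
The bookkeeping in claim 14 can be sketched as below. The claim only says that the average health is "a function of the fitness values"; the arithmetic mean used here is an assumption, and the function names are hypothetical.

```python
def average_health(fitness):
    """Average health of the population (assumed here to be the mean)."""
    return sum(fitness) / len(fitness)

def vulnerability_list(fitness):
    """Indices of individuals whose fitness is lower than or equal to the
    average health. They stay in the population but are marked out."""
    avg = average_health(fitness)
    return [i for i, f in enumerate(fitness) if f <= avg]

fitness = [0.2, 0.8, 0.5, 0.9]
marked = vulnerability_list(fitness)
possible_marriages = len(marked)  # list size bounds the number of marriages
```

For this population the first and third individuals fall at or below the average, so two marriages are possible and their offspring can occupy those two places.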

15. (Currently Amended) A method according to claim 14, wherein [[in which]] for
coupling purposes and for generation of children at least one parent individuals must have a 
fitness value greater than the average health value of the population. 

16. (Currently Amended) A method according to claim 14, [[characterised in that]]
wherein each couple of individuals can generate [[individuals]] offsprings having a fitness
different from the average health[[, so called offsprings]] if the fitness of one them, at least is
greater than the average fitness, the offsprings of each marriage occupying the places of
subjects entered in the vulnerability list [[and]] which are marked out, so that a weak individual can
continue to exist through his own children. 

17. (Currently Amended) A method according to claim 14, [[characterised in that]]
wherein coupling between individuals having a very low fitness value and a very high fitness
value are not allowed. 

18. (Currently Amended) A method according to claim 14, [[characterised in that]]
wherein the following recombination rules of the genes of the parents individuals coupled
are considered in the case the parents individuals have not common genes:

the health of father and mother individuals are greater than the average health of the 
entire population; the crossover is a classical crossover according to which the genes of the 
father and of the mother individuals are substituted one with the other starting from a certain 
crossover point; 

the health of father and mother individuals are lower than the average health of the 
entire population[[; in this]] in which case the two children are formed through rejection of the
parents genes they will receive by the crossover process; 

the health of one of the parents is less than the average health of the entire population 
while the health of the other parent is greater than the average health of the entire population[[; in
this]], in which case only the parents whose health is greater than the average health of the entire
population will transmit their genes, while the genes of the parent having an health lower than 
the average health of the entire population are rejected. 
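
One possible reading of the three recombination rules of claim 18 is sketched below for binary genes. It assumes that "rejection" of a gene means switching its status, as claim 19 suggests, that a fitness exactly equal to the average counts as weak, and that the crossover point `cut` is supplied by the caller. None of these choices is fixed by the claim, and the names `flip` and `recombine` are hypothetical.

```python
def flip(genes):
    """Reject binary genes by switching each one to the other status."""
    return [1 - g for g in genes]

def recombine(father, mother, f_fit, m_fit, avg, cut):
    """Return two children according to the parents' health vs. the average."""
    f_strong, m_strong = f_fit > avg, m_fit > avg
    if f_strong and m_strong:
        # Both healthy: classical one-point crossover.
        return father[:cut] + mother[cut:], mother[:cut] + father[cut:]
    if not f_strong and not m_strong:
        # Both weak: children reject the genes they would have received.
        return (flip(father[:cut] + mother[cut:]),
                flip(mother[:cut] + father[cut:]))
    if f_strong:
        # Only the father transmits genes; the mother's genes are rejected.
        return father[:cut] + flip(mother[cut:]), flip(mother[:cut]) + father[cut:]
    # Only the mother transmits genes; the father's genes are rejected.
    return flip(father[:cut]) + mother[cut:], mother[:cut] + flip(father[cut:])
```

With a father of all ones and a mother of all zeros, two healthy parents produce mixed children, two weak parents produce the complements of those children, and a mixed couple produces children that carry only the healthy parent's gene values.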

19. (Currently Amended) A method according to claim 18, wherein each gene is 
[[characterised]] characterized by a status level, [[the method further characterised in that]] and
wherein genes rejection [[consists in]] comprises modifying the status of the genes from one status
level to a different status level. 

20. (Currently Amended) A method according to claim 18, [[characterised in that]]
wherein a modified crossover of the genes of the parents individuals is carried out when the
parents individuals has part of the genes that coincide, this modified crossover [[provides]]
providing for generating [[and]] an offspring in which the genes selected for crossover are the most
effective ones of the parents. 

21. (Currently Amended) A method according to claim 14, wherein [[in which]] the
individuals are the different prediction algorithm representing a corresponding different initial
random distribution of data records onto the testing and the training data set and the genes
[[consist in]] comprise the binary status variable of association of each record to the training and to
the testing subset. 

22. (Currently Amended) A method according to claim 14, wherein [[in which]] the
individuals are the prediction algorithms each one representing a different training and testing
data set, the difference residing in a different selection of input variables for each different
training and testing subset, and wherein the genes [[consist in]] comprise the different selection
variable which is provided for each input variable in the different training and testing subsets, 
the above mentioned selection variable being a parameter indicating the presence/absence of 
each corresponding input variable in the records of each data set. 

23. (Currently Amended) A method according to claim 1, wherein the method
[[characterized in that it]] is in the form of a software program comprising instructions executable
by a CPU, the software program being stored in a memory [[to which]] accessible by the CPU [[can]]

24. (Currently Amended) A software program stored on a memory device, wherein
the [[said]] software program consisting in the method according to claim 1 in the form of [[a]]
executable instructions [[of]] by a CPU or [[of]] by a computer system.

25. (Currently Amended) A system for carrying out a method according to claim 
comprising: 

an apparatus or device for generating an action of response which is autonomously[[, i.e.
by itself,]] chosen among a certain number of different kinds of actions of response stored in a
memory of the apparatus or autonomously generated by the apparatus basing the [[said]] choice
of the kind of action of response on the interpretation of data collected autonomously by
[[means of]] one or more sensors responsive to physical entities or which are fed to the apparatus by
[[means of]] input means, [[the]] said interpretation being made [[by means of]] through a prediction
algorithm in the form of [[a]] software saved in a memory of [[the]] said apparatus and being
carried out by a central processing unit,

[[characterized in that]] wherein the apparatus [[being]] is further provided with means for
carrying out a training and testing phase of the prediction algorithm by inputting to the [[said]]
prediction algorithm data of a known database, in which input variables of the input data
representing the physical entities [[able to]] being sensed by the apparatus through the one or
more sensors and/or [[able to be]] fed to the apparatus by [[means of]] the input means are
univoquely correlated to at least one definite kind of action of response among the different
kinds of possible action of response, the [[said]] means for carrying out the training an testing
being in the form of a training and testing software saved in a memory of the apparatus, the said
training and testing being carried out by [[means of]] a method according to claim 1, the [[said]]
training and testing software program being [[the]] said method of training and testing in the form
of a software program or instructions.

26. (Currently Amended) The system according to claim 25, [[characterized in that it]]
wherein the system is a system for sound or vocal recognition comprising input means
responsive to acoustic waves, a processing unit connected to the input means responsive to
acoustic waves, at least a memory in which a software program is stored, [[the]] said program
being in the form according to claims 23 or 24 and comprising coded instructions for enabling
the processing unit to carry out a method according to claim 1, a further or the same above
mentioned memory, in which a dataset of known data records is stored or [[can be stored]]
storable and/or input means for storing in the further or [[the]] said above mentioned memory a
dataset of known data records.

27. (Currently Amended) The system according to claim 25, [[characterized in that it]]
wherein the system is a system for image recognition, the input means being [[responsible]]
responsive to electromagnetic waves, the system being [[able]] configured to recognize the shape
of an object generating or reflecting electromagnetic waves, and/or the distance and/or the 
identity of the object. 

28. (Currently Amended) The system according to claim 26, [[characterized in that]]
wherein the database of known data records comprises acoustic signals emitted by one or more 
objects or one or more living beings making part of the typical environment in which the device 
has to operate or the data relating to one or more images of one or more objects or one or more 
living beings making part of the typical environment in which the device has to operate to which 
are univoquely correlated to corresponding known kind, and/or identity and/or meaning of 
objects to which [[the]] said acoustic signals or image data are related and/or from which [[the]] said
acoustic signals or image data are generated. 

29. (Currently Amended) The system according to claim 27, [[characterized in that it]]

wherein the system is a specialized system for image pattern recognition having
artificial intelligence utilities for analyzing [[a digitalized image, i.e.]] an image in the form of a array
of image data records, each image data record being related to a zone or point or unitary area
or volume of a two or three dimensional visual image[[, so called pixel or voxel of a visual image]],
the [[said]] visual image being formed by an array of the said pixels or voxels and utilities for
indicating for each image data record a certain quality among a plurality of known qualities of 
the image data records; 

the system having a processing unit[[ as for example a conventional computer]], a memory
in which an image pattern recognition algorithm is stored in the form of a software program 
which can be executed by the processing unit; 

a memory in which a certain number of predetermined different qualities which the 
image data records [[can]] is configured to assume has been stored and which qualities [[has]] have
to be univoquely associated to each of the image data records of an image data array fed to the 
system; 

input means for receiving arrays of digital image data records or input means for 
generating arrays of digital image data records from an existing image and a memory for storing 
[[the]] said digital image data array;

output means for indicating for each image data record of the image data array a certain 
quality chosen by the processing unit in carrying out the image pattern recognition algorithm in 
the form of [[the]] said software program;

the image pattern recognition algorithm [[is]] being a prediction algorithm in the form of a
software program, which prediction algorithm is further associated to a system being further 
provided with a training and testing software program; 

the system [[is able]] being configured to carry out training and testing according to the
method of claim 1; 

the method [[is]] being provided in the system in the form of the training and testing
software program; and 

a database being also provided in which data records are contained univoquely 
associating known image data records of known image data arrays with the corresponding 
known quality from a certain number of predetermined different qualities, which the image data
records [[can]] is configured to assume.



30. (Currently Amended) A method for producing a microarray for genotyping 
operations, the [[said]] method comprising the steps of defining a certain number of theoretically
relevant genes or alleles or polymorphisms considered relevant for a certain biologic condition
like a tissue structure, a pathology or the potentiality of developing a pathology or an anatomic
or morphologic feature, the method comprising:

[[a)]] providing a database of experimentally determined data in which each record 
relates to a known clinical or experimental case of a sample population of cases, [[and]] which
records comprise a certain number of input variables corresponding to the presence/absence of 
a certain predetermined number of polymorphisms and/or mutations and/or equivalent genes of 
a certain number of theoretically probable relevant genes, said certain predetermined number of 
polymorphisms and/or genes forming a set, and one or more related output variables 
corresponding to the certain biological or pathologic condition of the [[said]] clinical and
experimental cases of the sample population; 

[[characterized by the following further steps:]]

[[b)]] determining a selection of a subset of the set of certain predetermined number of 
polymorphisms and/or genes by testing the association of the [[said]] genes or polymorphisms and
the biological or pathological condition by [[means of]] mathematical tools applied to the database;

[[c)]] the [[said]] mathematical tools [[comprise]] comprising a [[so called]] prediction algorithm
[[such as a so called neural network]];

[[and the further steps are carried out of:]]

[[d)]] dividing the database into a training and a testing dataset for training and testing 
the prediction algorithm; 

[[e)]] defining two or more different training datasets each one having records with a set 
of input variables obtained by excluding one or more input variables from the originally defined 
number of input variables, while for each record the set of input variables of the corresponding 
training set has at least one input variable which is not a member of the set of input variables of 
the other training datasets, each said at least one input variable [[consisting in]] comprising a
different gene or a different polymorphism[[,]] and/or a different mutation and/or a different
functionally equivalent gene thereof of the originally considered genes or polymorphisms and/or
mutations and/or functionally equivalent genes thereof considered theoretically potentially 
relevant for the biologic or pathologic condition; 

[[f)]] training the prediction algorithm with each of the different training sets defined [[under
point e)]] at the previous step for generating a first population of different prediction algorithms
which are divided into two groups of mother and father prediction algorithms and testing the [[said]]
prediction algorithms with the associated testing set; 

[[g)]] calculating a fitness score or prediction accuracy of each father and mother 
prediction algorithms of the said first population [[by means of]] through the testing results;

[[i)]] providing [[a so called]] an evolutionary algorithm such as a genetic algorithm and
applying the evolutionary algorithm to the first population of mother and father prediction 
algorithms for achieving new generation of prediction algorithms whose training and testing 
dataset comprises records whose input variables selections are a combination of the input 
variable selections of the records of the training and of the testing datasets of the first or 
previous population of father and mother prediction algorithms according to the rules of the 
evolutionary algorithm; 

[[j)]] for each generation of new prediction algorithms representing each new variant
selection of input variables, the best prediction algorithm according to the best hypothesis of
input variable selection [[is]] being tested or validated [[by means of]] through the testing dataset;

[[k)]] evaluating a fitness score [[is evaluated]] and the promoting prediction algorithms
representing the selections of input variables which have the best testing performance with the
minimum number of input variables utilized [[are promoted]] for the processing of new generations;

[[l)]] repeating the preceding two steps i) to k) until a predetermined fitness score defined
as best fit of the prediction algorithm and a minimum number of input variables has been 
reached; and 

[[m)]] defining as the selected relevant input variables, the selected input variables
comprising [[i.e. as]] the relevant genes or polymorphisms and/or of mutations and/or of
functionally equivalent genes thereof the ones related to the input variables of the selection 
represented by the prediction algorithm having both at least the predetermined fitness score and 
also the minimum number of selected input variables. 

31. (Currently Amended) A method according to claim 30, [[characterized in that]]
wherein an optimization of the distribution of the records of the original database in a training
dataset and in a testing dataset is carried out in one of a pre processing and a post processing
phase[[, i.e. before carrying out the steps e) to m) at step d) or after having carried out the steps]]

32. (Currently Amended) The method according to claim 31, further comprising the
following steps of [[optimisation]] optimization:

defining a set of one or more distributions of the database records onto respective 
training and testing subsets; 

using the defined set of distributions to train and test a first generation set of one or more 
prediction algorithms and assigning a fitness score to each; 

feeding the set of prediction algorithms to an evolutionary algorithm which generates a 
set of one or more second generation prediction algorithms and assigns a fitness score to each; 
and 

continuing to feed each generational set of prediction algorithms to the evolutionary 
algorithm until a termination event occurs; 

[[where]] wherein said termination event is at least one of a prediction algorithm is
generated with a fitness score equalling or exceeding a defined minimum value, the maximum 
fitness score of successive generational sets of prediction algorithms converging to a given 
value, and a certain number of generations having been generated. 
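
The three termination events recited in claim 32 can be checked with a small helper. This is a sketch under stated assumptions: the function name `should_terminate` is hypothetical, and "converging to a given value" is approximated by the best fitness staying within a tolerance `tol` over the last `patience` generations, a criterion the claim does not spell out.

```python
def should_terminate(best_history, min_fitness, max_generations,
                     patience=3, tol=1e-6):
    """best_history holds the maximum fitness of each generation so far."""
    if not best_history:
        return False
    if best_history[-1] >= min_fitness:
        return True   # a prediction algorithm reached the minimum value
    if len(best_history) >= max_generations:
        return True   # the allowed number of generations was produced
    recent = best_history[-patience:]
    # Convergence: the best score stopped moving over the recent generations.
    return len(recent) == patience and max(recent) - min(recent) <= tol
```

A caller would append the best fitness of each new generation to `best_history` and stop feeding the evolutionary algorithm as soon as the helper returns `True`.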

33. (Currently Amended) The method according to claim 31, further comprising the
following steps: 

generating a population of prediction algorithm [[each one of them is]] trained and tested
according to a different distribution of the records of the data set in the complete database onto 
a training data set and a testing data set; 

each different distribution being created by a random distribution or a distribution formed
by a [[deterministic mathematical process characterized as a]] pseudo-random distribution;

each prediction algorithm of the said population [[is]] being trained according to its own
distribution of records of the training set and [[is]] being validated in a blind way according its own
distribution on the testing set; 

a score reached by each prediction algorithm [[is]] being calculated in the testing phase
representing its fitness; 

an evolutionary algorithm being further provided which combines the different models of 
distribution of the records of the complete data set in a training and in a testing set, which sets
are represented each one by a corresponding prediction algorithm trained and tested on the 
basis of the said training and testing data set according to the fitness score calculated in the 
previous step for the corresponding prediction algorithm; 

the fitness score of each prediction algorithm corresponding to one of the different 
distributions of the complete data set on the training and the testing data sets being the 
probability of evolution of each prediction algorithm or of each said distribution of the complete 
data set on the training and testing data sets; 

repeating the evolution of the prediction algorithm generation for a finite number of 
generations or till the output of the genetic algorithm converges to a best solution and/or till the 
fitness value of at least some prediction algorithm related to an associated data records 
distribution has reached a desired value; and 

setting the data records distribution for the best solution as the optimized training and 
testing subsets for training and testing prediction algorithm. 

34. (Currently Amended) A microarray for genotyping comprising a reduced number 
of genes, alleles or polymorphisms, [[characterized in that]] wherein the reduced number of the
said genes, alleles or polymorphisms has been selected [[by means of]] with a method according to
claim 30. 

35. (Currently Amended) A method for performing a supervised learning process in 
an artificial intelligence environment including optimizing a database of sample records for the 
training and testing of a prediction algorithm for a problem under investigation characterized by 
input variables and output variables, the prediction algorithm used for predicting output variables 
for real world data, the method comprising the steps of: 

defining a set of one or more distributions of the database records onto respective 
training and testing subsets;

using the defined set of distributions to train and test a first generation set of one or more 
prediction algorithms and assigning a fitness score to each, each of said prediction algorithms 
being associated with a certain distribution of said database records; 

feeding the set of prediction algorithms to an evolutionary algorithm which generates a 
set of one or more second generation prediction algorithms and assigns a fitness score to each; 

continuing to feed each generational set of prediction algorithms to the evolutionary 
algorithm until a termination event occurs, where said termination event is at least one of: 

a prediction algorithm is generated with a fitness score equal to or exceeding a
defined minimum value,

the maximum fitness score of successive generational sets of prediction 
algorithms converging to a given value, [[and]] or

a certain number of generations having been generated; 

selecting a prediction algorithm having a best fitness score; 

using the distribution of database records associated with said selected prediction 
algorithm in performing supervised learning, said supervised learning including training and 
testing of prediction algorithms to obtain a trained prediction algorithm; and 

using the trained prediction algorithm to predict the output variables relating to the 
problem under investigation where only the input variables are known, 

wherein said method is performed using a computer and computer software 
forming an intelligent system; 

the method further comprising the steps of: 

generating a population of prediction algorithms, wherein each one of said prediction 
algorithms is trained and tested according to a different distribution of the records of the data set 
in the complete database onto a training data set and a testing data set, 

each different distribution being created as one or more of a random distribution 
or a pseudorandom distribution, 

each prediction algorithm of [[the]] said population being trained according to its
own distribution of records of the training set and [[is]] being validated in a blind way
according its own distribution on the testing set, and

S/N 10/542,208
Request for Continued Examination

a score reached by each prediction algorithm being calculated in the testing 
phase representing its fitness; 

providing an evolutionary algorithm which combines the different models of distribution 
of the records of the complete data set in a training and in a testing set which sets are 
represented each one by a corresponding prediction algorithm trained and tested on the basis 
of [[the]] said training and testing data set according to the fitness score calculated in the
previous step for the corresponding prediction algorithm, 

the fitness score of each prediction algorithm corresponding to one of the 
different distributions of the complete data set on the training and the testing data sets 
being the probability of evolution of each prediction algorithm or of each said distribution 
of the complete data set on the training and testing data sets; 

repeating the evolution of the prediction algorithm generation for a finite number of 
generations or till the output of the genetic algorithm converges to a best solution and/or till the 
fitness value of at least some prediction algorithm related to an associated data records 
distribution has reached a desired value; and 

setting the data records distribution for the best solution as the optimized training and 
testing subsets for training and testing prediction algorithm.